ABSTRACT

We consider an (M,N) wireless link (M transmit antennas and N
receive antennas) impaired by additive white Gaussian noise. The
transmitter, which is subject to a power constraint, does not know the
outcome of the random matrix channel, which has a static and flat frequency
characteristic. It does know the channel statistics. The link operates at a limit
on the probability of outage. We stratify diagonals in space-time to express a
message for efficient communication with limited receiver complexity. The
special message arrangement enables the receiver to substantially mute self
interference caused by multipath and, despite the M-dimensional transmit
signal, avoid an explosion of processing complexity in the spatial domain.

We investigate examples in important downlink categories, showing
that the message architecture can be extremely efficient. For all (M,1)
systems it is maximally efficient. At 10% outage, and with matrix Rayleigh
channels with 10 dB average SNR, we see that an (8,3) and a (4,2) system can
operate at over 95% and over 90% of Shannon capacity respectively.
1 INTRODUCTION
1.1 OBJECTIVE 

We will be exploring (M,N) wireless communication links (M 
transmit antennas and N receive antennas), that are impaired by additive 
white Gaussian noise (AWGN). We assume that, when using such a
channel, the transmitter does not know the spectrally flat N by M
matrix transfer characteristic. The transmitter, which is subject to a power
limit, transmits with equal power out of each of its M transmit antennas. We 
assume a "long burst" context. The view is that when the channel is used 
from time to time, its transfer characteristic holds constant for the 
communication burst, yet the channel can change significantly from one 
burst to the next. The transmitter only knows the channel matrix statistics.
For each burst, a large number of symbols are sent, permitting the standard 
infinite time horizon perspective common in information theory. 

The transmitter does know the channel statistics, so it can infer the
distribution of channel capacities it could attain if it were privy to the 



random channel outcomes. Say that the channel is allowed to be in an outage 
state for a small percentage, say X%, of its realizations. Without knowledge 
of the individual channels, the (M,N) link can, in principle [1,2], operate at
the X% capacity level for the remaining channel outcomes. In this paper the 
(M,N) system will be constrained by the stipulation that despite the spatially 
M-dimensional (M-D) transmit signal, an explosion of processing 
complexity in the spatial domain must be avoided by communicating with 
parallel one dimensional (1-D) codecs. As discussed in Section 2.0, other 
basic implementation concerns will be kept in mind as well. 

We will suggest a means for expressing signals so that the encoded 
message is disposed in space-time to enable the receiver to substantially 
mute self interference. After describing the communication architecture in 
Section 2.0, we analyze it in Sections 3.0 and 4.0 and then study it 
numerically for the case of a matrix Rayleigh channel in Section 5.0. We 
will see evidence that such an architecture, when used in certain important
downlink categories, such as when M is substantially in excess of N, can
operate at a specified outage level with near maximum efficiency. 

The study of (M,N) systems is a very active area of research: [3-11]
are sample references. Additional related papers are cited as we proceed. In 
contrast to many of the references, we are primarily concerned with a 
message's architectural superstructure so that the specific coding and 
modulation used in the fundamental 1-D space-time building blocks that we 
will introduce will not be our direct concern. A few of the references do deal
with architectural superstructures: [12] is concerned with general (M,1),
while [13-14] and [15] investigate (2,1) and (4,1) respectively.



1.2 THE VECTOR CHANNEL

We take a baseband view of an (M,N) wireless communication link. 
The transmitted M-D vector signal has components denoted $s_i(t)$, i = 1,...,M,
that are nominally complex, statistically independent white Gaussian signals 
of equal power, P/M. So the total radiated power is P. These M components 
of a vector process, s(t), are transmitted over a noisy, spectrally flat, N by M 
complex matrix channel, G, that can cause the M transmitted signal 
components to interfere with each other. This self-interfering means of
communication is described by the following vector equation for the
received N-D signal, r(t), in terms of s(t):

$$ r(t) = G\,s(t) + v(t). \tag{1.1} $$

The N-D vector, v(t), represents the complex additive white Gaussian noise
(AWGN) impairment. We assume that v(t) is both temporally and spatially
white. For convenience we will often take time to be discrete, with each 
clock tick corresponding to the time it takes for exactly one coded vector 
symbol to be received (or alternately, sent). 

When convenient for simplifying our analysis we can redefine the
vectors in (1.1) to be normalized. Then, each vector component is divided by
the standard deviation of an additive noise component, $\sigma$, so that all the
noise variances become unity and the normalized signal power radiated from
each transmit antenna becomes $P/(M\sigma^2)$ instead of P/M.

The signal s(t) has spatially and temporally white Gaussian 
components representing the limit of a bandwidth efficient, forward error 
protected, encoded vector signal. We will architect the space-time 



disposition of various basic building blocks that are put together to compose 
this white Gaussian vector process. 

The channel matrix is perfectly known at the receiver: it takes 
vanishingly small rate to probe the channel, so that the receiver learns it with 
arbitrary accuracy without detracting from capacity. The Shannon capacity 
of the channel described by equation (1.1) is given by 

$$ C = \log_2 \det\left[ I_N + \frac{P}{\sigma^2 M}\, G G^\dagger \right]. \tag{1.2} $$

In this equation, $I_N$ is the N by N identity matrix and det means determinant.
The formula [1,2,8-10,16,17], termed the LogDet capacity formula can be 
derived from fundamental information theory considerations, as, e.g., appear 
in [18]. It holds when the transmitter only knows the value C, and not the 
actual entries in G. Then, with a suitably encoded signal, s, the capacity, C, 
can be "attained". More precisely, in the limit of a sequence of progressively 
more powerful codes, error free transmission at rate C can be approached to 
within an arbitrarily small deficit. 
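
For concreteness, the LogDet formula (1.2) can be evaluated numerically. The following is a minimal Python sketch under stated assumptions: NumPy is used, the function name `logdet_capacity` is ours, and the argument `snr_per_antenna` stands for the ratio P/(sigma^2 M).

```python
import numpy as np

def logdet_capacity(G, snr_per_antenna):
    """Evaluate C = log2 det[I_N + (P/(sigma^2 M)) G G^H] in bits per channel use.

    G is the N-by-M channel matrix; snr_per_antenna is an illustrative
    name for P/(sigma^2 M), not notation from the paper.
    """
    G = np.asarray(G)
    N = G.shape[0]
    A = np.eye(N) + snr_per_antenna * (G @ G.conj().T)
    # slogdet is numerically safer than log(det(A)) for large N.
    _, logabsdet = np.linalg.slogdet(A)
    return logabsdet / np.log(2.0)
```

For a (2,2) identity channel with P/(sigma^2 M) = 1, the matrix inside the determinant is 2 I_2, so the sketch evaluates to 2 bits per channel use.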

The LogDet capacity will be our target. Our primary interest is in 
exploring how close we can come in important downlink situations when 
additional demands are placed on the communication architecture. 

2 STRATIFIED LAYERING TO REDUCE INTERFERENCE 

2.1 DIAGONAL LAYERING OF SPACE-TIME 

In our architectural superstructure for a received message, space is 
discrete with M components, m = 1, 2, ... M, corresponding to labeling of 



the transmit antennas. With each tick of the clock, an M-D encoded vector 
symbol is received (sent) consistent with equation (1.1). For the time it takes 
to receive (send) a coded message, we can view space-time as a rectangle. 
Later, in our information theory analyses, we will be interested in the limit 
of messages of infinite duration. 

Reference [1] offered a look at a means of communication processing 
that used the concept of imaginary "diagonals" passing downward and to the 
right through space-time. Communication of this type, which effectively 
involves a spatially 1-D channel has been described as D-BLAST for 
diagonal Bell Labs layered space-time. The 1-D channel is an AWGN 
channel, but of the type whose noise power changes with time [19,20]. We 
will briefly highlight some of the ideas from [1] which we will then replace 
by a more refined way to utilize space-time. 

Examples of diagonals are shown in Figure 1 for the case when there 
are five transmit antennas. These diagonals arise in describing the space- 
time disposition of 1-D encoded message constituents that are cycled over 
the antennas that they radiate from. The transmitter, not knowing the
channel, does not know which transmitters enjoy stronger propagation paths 
over to the receive array and which suffer weaker paths, so, by cycling, the 
signal is hedged. These diagonals are referred to as diagonal layers (layers of 
space-time). Importantly, these diagonals offer a means of muting mutual 
interference, while enabling 1-D processing. As explained in the reference, 
a diagonal layer can amount to one coded block with a downward and to the 
right time sense. In the detection process these diagonal layers are "peeled
off" one by one, left to right. At any time, signal constituents associated
with previously peeled diagonals have been removed as a source of 



interference in detecting bits in subsequent diagonals. Interference from 



constituents in diagonals that will be peeled off later are muted using 
minimum mean square error processing, and, if the received signal to noise 
ratio is large enough, zero forcing is an option that is essentially as good. 

Figure 1 : Two Examples of Diagonal Layering of Space-Time for the Same Message 
Duration. Space is composed of five transmit antennas. Every fifth diagonal layer is 
shown in both figures, including the first and last diagonal layer of a message. In the 
lower figure each diagonal has a longer time duration and therefore wastes more 
space-time both at the start and end of an encoded message. 




The processing highlighted above can be considered to be a version of 
decision feedback in space-time. As explained in reference [1], in many 
interesting cases C can be approached arbitrarily closely with 1-D 
processing in circumstances when the transmit array does not know the 
channel matrix. We mention in passing that recently this was shown to be 
true for all N by M matrix channels, [21]. This was done by combining D- 
BLAST analysis with a method used in references dealing with related cases 
where key information is known to the transmitters [22-24]. 

Figure 1 shows two of many possibilities for diagonalization. There 
are two practical reasons for preferring the short diagonal choice over the 



long diagonal choice. At the start of a message, [1,2] and once again at the 
end, there is wasted space-time and we see the waste is less for the case of 
shorter diagonals. Also, in practice, the channel coherence time can be a 
concern if the duration of a coded block begins to approach a significant 
fraction of the coherence time. This concern arises because the diagonal 
duration approaching the coherence time undermines the assumption of a 
constant channel during a message. We stress that these two concerns 
involve practical, not theoretical, issues. They are inconsequential when one 
is making the idealizing assumptions that the channel is not time varying and 
the message is infinitely long. 

For the two practical reasons just mentioned, it would appear that 
coding over a short diagonal is preferred. However, a countervailing 
practical issue arises. In practice one cannot use too short a diagonal as that 
does not allow adequate time to support a powerful bandwidth efficient code 
with forward error correcting capabilities. Also, the signal to noise ratio 
changes as one moves down the diagonal and that makes for a nonstandard 
coding environment. We will confront these issues.

2.2 REFINED STRATIFIED DIAGONALS TO MUTE INTERFERENCE 

Toward overcoming these practical difficulties, we introduce a 
method of communication drawing on a more refined view of space-time. 
We call the technique SD-BLAST for Stratified Diagonal BLAST. 

As with D-BLAST, the method will involve cycling spatially 1-D 
coded/modulated signal constituents periodically over the M transmit 
antennas. In accord with this cycling we take the integer labels for transmit 
antennas to be the integers mod (M). It will also be convenient to view time, 
which is also finite, as modular. Since, in all of our analysis, we are 



interested in the limit of long messages, it will be harmless to assume that 
the number of vector symbols in a message, T, is a multiple of M. So the 
space-time occupied by a message is viewed as the 2-D surface of a torus as 
shown in Figure 2. In place of successive diagonals layering space-time as in 
[1], we think instead of layering toroidal space-time with M helices perfectly 
winding around and covering the torus as noted at the bottom of the figure. 
In effect a long diagonal has been replaced by a helix that winds T/M times 
around the torus. The M congruent helices are offset from each other in time, 
by the time to send one symbol. These M helices also have an internal n-fold 
stratified spatial structure as we next explain. 
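
The helical covering of toroidal space-time can be illustrated with modular arithmetic. The sketch below is only one possible indexing convention, assumed here for illustration: antennas and helices are numbered 0,...,M-1, symbol times 0,...,T-1, and helix d sits on antenna (d + t) mod M at time t. The function names are ours.

```python
def helix_of_cell(antenna, t, M):
    """Index of the helix occupying transmit antenna `antenna` at symbol time t
    (assumed convention: helix d is on antenna (d + t) mod M at time t)."""
    return (antenna - t) % M

def helix_cells(d, M, T):
    """List of (antenna, time) cells visited by helix d over T symbol times."""
    if T % M != 0:
        raise ValueError("T is taken to be a multiple of M")
    return [((d + t) % M, t) for t in range(T)]
```

Under this convention each helix steps one antenna per symbol time, winds T/M times around the torus, and the M helices together cover every (antenna, time) cell exactly once.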



FIGURE 2: View of space-time at receiver. Each of the n plies is helically
(diagonally) stratified; one further ply is labeled white noise. Loops taken by two
of the five helices in the top ply are shown for the case of five transmit antennas.
The helical (saw tooth) directed line expresses the time arrow for processing time.
Axis labels: symbol duration; symbol receipt time.

It will be useful to consider the toroidal space-time surface as plied so
that the surface is actually n+1 nested, infinitely thin, tori. We say that we
have n+1 plies because there are n signal plies and one AWGN ply. A
message will be expressed as M·n 1-D coded blocks and each block occupies
exactly one out of the M helices in one of the n signal plies. The M helices 
which, taken collectively, perfectly wind around and pave a toroidal ply are 
called strata. We can think of the plies as energized by received signal 
constituents, except that the last innermost ply is energized by AWGN. A 
received signal is composed of M diagonal layers and each diagonal layer is 
composed of n helices, or, what is the same, n strata. 

Figure 3 shows, at a high level, the way that a 5-D transmitted 
message corresponding to the space-time structure of Figure 2 is composed. 
First, a primitive bit stream is demultiplexed into five separate streams of 
equal rate. Then each of these streams is further demultiplexed into four 
substreams that are independently encoded. Later we will see that there is 
flexibility in choosing rates for these substreams. We will assume each of 
these substreams is modulated and encoded to compose four subsignals of 
equal power but different rates. The equal power assumption is to streamline 
the mathematical proof of our main result, reporting that in many cases, 
close to LogDet capacity can be attained with 1-D codes.

For codes having a time sense, by maximizing the number of times a 
helix winds around the torus, all the strong path - compensates - weak path 
opportunities can be quickly (according to the processing clock) encountered 
and accomplished. This hastens the bit decision process. As already noted, 
diagonalization accomplishes hedging. As we shall see, the stratification 
feature enables significant muting of interference into any one of the n plies. 
Figure 4 shows five diagonal layers, each with a different color.
Furthermore, these diagonals are themselves composed of diagonal layers 
having the same SE directed slope. These layers within the diagonal layers 
are the strata. Four strata per diagonal layer are shown. Within each of the 
five diagonal layers, each of the four strata has a different pattern. So a total 
of twenty distinct strata are depicted in Figure 4. The strata within the
AWGN ply are not shown. 

At this point we suggest thinking of each basic strata constituent (i.e.,
smallest rectangular block), coordinatized by its color and pattern at time t,
as merely an accommodation in space-time for a basic signal constituent. As
noted in Figure 2, we will also keep time with a second separate clock called the
processing clock.

3 DETECTION 

In this section we will give a high level somewhat qualitative 
description of the form of processing that is involved in deciding bits. Then, 
in the next section, we give an information theory analysis showing that the 
stratified toroidal structure can, in certain important cases "attain" or come 
very close to "attaining" LogDet performance. 

At the receive array, the encoded bits are detected as follows. Each 
ply is to be thought of as an annulus like the ring of an onion slice. See 
Figure 5. The outer brick patterned ply is detected first. Detection is then
followed by removal of the signals in this ply as a source of interference.
Prior to removal, this outermost ply, which like all four plies holds five
strata, is said to be exposed. In the process of detecting the bits in the strata 
in this exposed ply, each brick patterned stratum can be, but need not be, 
processed simultaneously and separately. After all the bits in the five strata that
make up the outside ply are detected error free, the interference 
corresponding to this ply is then subtracted from the received signal vector. 
The diamond ply is then exposed. The bits in the five diamond strata are 
then detected and then the diamond strata signals are cancelled exposing the 
checkered ply, and so on, until finally the plaid patterned ply is exposed and 
the bits in its five strata detected. 




FIGURE 5: Peeling away of successive strata from the outside in, in one of the
colored layers. Here a layer is depicted as an "onion" with five "rings". In the
sequence, peeling away another outer ring corresponds to removal of another
stratum of interference. Any stratum is interfered with by all the other strata on the
same ply as well as by the strata that make up relatively inner plies. The peeling
process can be simultaneously conducted in all five of the colored strata in a ply.

In preparing an exposed ply for detection, the N-D received vector, 
r(t), that was received at time t, has already been processed achieving perfect 
removal of the relatively outer ply. So each inside ply is free of interference
from all outside plies. It may be helpful to think of a sequence of processed 
received vectors bootstrapped with r(t). So, with each major step in the 
processing sequence, the original received vector process r(t) sheds another 
ply's worth of interference:

$$ r(t) = r^{1}(t) \to r^{2}(t) \to r^{3}(t) \to r^{4}(t) \tag{3.1} $$

where each of the four vectors is an N-D function of time defined over the 
message duration. Adding another level of refinement, we point out that the 
detection process actually includes five different copies of (3.1), one is used 
for each layer (color) in the example. Then a second superscript, m, is 
implicit for (3.1). 

Each vector is then collapsed to a scalar using a weight vector, 
defined (up to an arbitrary complex scalar multiple), to maximize the signal 
to noise plus interference ratio (SINR) associated with this decision statistic. 
Maximization of SINR is a well known process, see, e.g., reference [2]. 
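
The collapse of each vector to a scalar can be sketched as follows. This is a minimal illustration, assuming the standard result that the SINR-maximizing weight is proportional to the inverse impairment covariance applied to the signal direction; the names `max_sinr_weight`, `sinr`, `g` (column of the channel for the stratum being detected) and `cov` (covariance of interference plus noise) are ours.

```python
import numpy as np

def max_sinr_weight(g, cov):
    """Weight vector w proportional to cov^{-1} g (defined up to a complex scalar)."""
    return np.linalg.solve(cov, g)

def sinr(w, g, cov, signal_power=1.0):
    """SINR of the scalar decision statistic w^H r, for r = g s + impairment."""
    num = signal_power * abs(np.vdot(w, g)) ** 2   # np.vdot conjugates its first argument
    den = np.real(np.vdot(w, cov @ w))
    return num / den
```

With this choice of w the SINR equals signal_power times g^H cov^{-1} g, and scaling w by any nonzero complex number leaves it unchanged.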

4 NEAR LogDet CAPACITY IN CERTAIN CASES 

Next we write a mathematical formula for the bit rate that each 
stratum is capable of supporting if a genie informs the transmitter of the 
individual stratum capacities. Then we will investigate the large n 
asymptotics of this and related capacity formulas. The formulas that we 
derive will be useful in determining what can be achieved when the genie is 
put back in the bottle, that is, when the transmitter is not informed of the 
stratum capacities, and that is our main interest. 




4.1 GENIE INFORMED STRATA CAPACITIES 

We use n for the number of signal strata per diagonal layer. We derive
the formula for the ultimate error free bit rate that each stratum is capable of
supporting when it is powered with power P/(Mn). We investigate the received
signal constituent, $s_{mi}$, on the i-th stratum of the m-th component at time t, and
how it is impaired. For simplicity we proceed explicitly using only the first
two coordinates. We do so for when the m-th stratum exposed on the i-th ply is to
be detected following the detection and removal of interference from the
strata on plies 1, 2, ..., i-1. Along with AWGN, this constituent, $s_{mi}$, is
impaired by all of the simultaneously transmitted strata with equal or higher
indices that are not yet detected. The constituents of these interfering strata
are $\{ s_{m'l} : i \le l \le n,\ 1 \le m' \le M,\ m' \ne m \text{ when } l = i \}$.

For analytical convenience we express the N by M channel matrix, G, 
in terms of its column vectors, so 

$$ G = (g_1, g_2, \ldots, g_M) \tag{4.1} $$

where $g_m$ is an N-D vector corresponding to the m-th transmit antenna.
Drawing on (1.1), we see the vector expression involving the received scalar
signal, $s_{mi}$, along with its additive impairment is therefore expressed as

$$ g_m s_{mi} + w^{(mi)}. \tag{4.2} $$

The (mi) superscript means that the i-th stratum of the m-th component is
omitted in the sum, because, in (4.2), the scalar $s_{mi}$ is signal, not interference.
We are working here with the normalized form of equation (1.1), so that the
N-D noise vector, v, has complex i.i.d. Normal(0,1) components.

In equation (4.2), the impairment is the N-D vector $w^{(mi)}$, which is defined by

$$ w^{(mi)} = \sum_{l=i}^{n} \left[ G \begin{pmatrix} s_{1l} \\ \vdots \\ s_{Ml} \end{pmatrix} \right]^{(mi)} + v. \tag{4.3} $$




For each value of m = 1,2,...,M the variance-covariance matrix of this
impairment is denoted by $\Sigma_{mi}$. In the next step we spatially whiten
this vector impairment by multiplying the expression in (4.3) by the positive
definite matrix $\Sigma_{mi}^{-1/2}$. We get the following expression for the spatially
whitened i-th stratum of the m-th component

$$ \Sigma_{mi}^{-1/2} g_m s_{mi} + \nu_{mi} \qquad (m = 1,2,\ldots,M). \tag{4.4} $$

Each one of the vectors in the sequence of N-D vectors, $\nu_{mi}$, has
complex Normal(0,1) components that are statistically independent of each
other. Indeed, for each value of m in a stratum independent noise is 
encountered. For a (1,N) system, where the scalar signal $s_{mi}$ is transmitted
and the N-D noisy received signal vector is described by (4.4), we know
from the LogDet formula for capacity that the capacity is

$$ C_{mi} = \log_2\left[ 1 + \left| \Sigma_{mi}^{-1/2} \bullet g_m \right|^2 \frac{P}{\sigma^2 M n} \right] \tag{4.5} $$



where the heavy dot denotes scalar product. For this (1,N) system, maximum 
ratio combining of the received vector components is a step in the 
processing for "achieving" Shannon capacity. 

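Next, employing E for expectation, we compute the covariance of the
impairment $w^{(mi)}$ of (4.3):

$$ \Sigma_{mi} = E\left[ w^{(mi)} w^{(mi)\dagger} \right] = I_N + \frac{P}{\sigma^2 M n} \sum_{j \ne m} g_j g_j^\dagger + \frac{n-i}{n}\,\frac{P}{\sigma^2 M}\, G G^\dagger. \tag{4.6} $$
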

We will also need the matrix identity 



gg' (4.7) 
Later we will need the large n asymptotic form of (4.6). Consequently,
for later use, we introduce z = i/n and write

$$ \Sigma_{mi} = I_N + \frac{P}{\sigma^2 M}\, G G^\dagger (1 - z) + O(1/n). \tag{4.8} $$



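Therefore, the signal to interference plus noise ratio (SINR) of the i-th stratum of
the m-th component is explicitly

$$ \mathrm{SINR}_{mi} = \frac{P}{\sigma^2 n M}\, g_m^\dagger \left[ I_N + \frac{P}{\sigma^2 M}\, G G^\dagger (1 - z) \right]^{-1} g_m + o(1/n). \tag{4.9} $$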



As time progresses, the i-th stratum in any diagonal layer experiences
an SINR that is periodic with period M. The capacity added by the i-th
stratum in any one diagonal layer, say, the d-th (1 ≤ d ≤ M), is then obtained
by averaging over the capacity contributions from the M transmit antennas.
This capacity, $C_i^d$, which by symmetry does not depend on d, is

$$ C_i^d = \frac{1}{M} \sum_{m=1}^{M} \log_2\left[ 1 + \frac{P}{n M \sigma^2}\, g_m^\dagger \Sigma_{mi}^{-1} g_m \right]. \tag{4.10} $$




This equation, which draws on formula (4.6), is a key equation that will be 
used for computations in Section 5. 
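
As an illustration of how (4.6) and (4.10) might be evaluated numerically, the following Python sketch computes the genie-informed capacity of the i-th stratum and the total over n strata and M diagonal layers. It is a sketch under stated assumptions, not the computation used for the figures: the function names are ours, `snr_per_antenna` stands for P/(sigma^2 M), and the covariance is written in the equivalent form I + (P/(sigma^2 M n)) [(n - i + 1) G G^H - g_m g_m^H].

```python
import numpy as np

def stratum_capacity(G, i, n, snr_per_antenna):
    """Capacity (4.10) of the i-th stratum (i = 1..n) of one diagonal layer."""
    G = np.asarray(G, dtype=complex)
    N, M = G.shape
    GGh = G @ G.conj().T
    p = snr_per_antenna / n               # normalized power per stratum, P/(sigma^2 M n)
    total = 0.0
    for m in range(M):
        g = G[:, m]
        # Impairment covariance (4.6): noise, undetected plies, same-ply other antennas.
        cov = np.eye(N) + p * ((n - i + 1) * GGh - np.outer(g, g.conj()))
        q = np.real(np.vdot(g, np.linalg.solve(cov, g)))   # g^H cov^{-1} g
        total += np.log2(1.0 + p * q)
    return total / M

def sd_blast_capacity(G, n, snr_per_antenna):
    """Sum of stratum capacities over n strata, times the M parallel diagonal layers."""
    M = np.asarray(G).shape[1]
    return M * sum(stratum_capacity(G, i, n, snr_per_antenna) for i in range(1, n + 1))
```

For the (2,1) channel G = (1, 1) with P/(sigma^2 M) = 1, the sketch gives 2 log2(1.5) bits for n = 1 and approaches the LogDet value log2(3) from below as n grows.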
4.2 ASYMPTOTIC FORM OF GENIE AIDED STRATA CAPACITIES

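Next we derive the large n asymptotic form of the strata capacities,
still with the tentative assumption that the transmitter is made privy to
these capacities. First we rewrite the previous equation as follows

$$ C_i^d = \frac{1}{M} \sum_{m=1}^{M} \log_2\left[ 1 + \frac{P}{n M \sigma^2}\, g_m^\dagger \left( I_N + \frac{P}{M \sigma^2}\, G G^\dagger (1 - z) \right)^{-1} g_m + o\!\left(\frac{1}{n}\right) \right]. \tag{4.11} $$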



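For the large n asymptote, we employ the small $\varepsilon$ approximation
$\log_2(1+\varepsilon) \approx \varepsilon/\ln 2$, so that the differential capacity contribution
of the i-th stratum is expressed

$$ C_i^d \approx \frac{P}{\sigma^2 n M^2 \ln 2} \sum_{m=1}^{M} g_m^\dagger \left( I_N + \frac{P}{\sigma^2 M}\, G G^\dagger (1 - z) \right)^{-1} g_m. \tag{4.12} $$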



In summing the incremental capacities, $C_i^d$, from i equal one to n, and then
taking the large n limit, we get an integral over [0,1]. For an integral of a
function of the form f(1-z) over [0,1], if we substitute $\zeta = 1 - z$, then
$\int_0^1 f(1-z)\,dz = \int_0^1 f(\zeta)\,d\zeta$. So, in the large n limit we proceed with $\zeta$
replacing 1-z and with $d\zeta$ associated with 1/n. Therefore, the capacity,
$\sum_{i=1}^{n} C_i^d$, becomes the following integral

$$ \sum_{i=1}^{n} C_i^d = \frac{P}{\sigma^2 M^2 \ln 2} \int_0^1 \sum_{m=1}^{M} g_m^\dagger \left( I_N + \frac{P}{\sigma^2 M}\, G G^\dagger \zeta \right)^{-1} g_m \, d\zeta. \tag{4.13} $$

Now, the detection of M such diagonal layers is executed in parallel 
so that the total capacity C is obtained by multiplying the right hand side of 
(4.13) by M. Employing tr for trace, after some minor linear algebra, the 
total capacity is seen to be 

$$ C = \frac{P}{\sigma^2 M \ln 2}\, \mathrm{tr} \int_0^1 G^\dagger \left( I_N + \frac{P}{\sigma^2 M}\, G G^\dagger \zeta \right)^{-1} G \, d\zeta. \tag{4.14} $$

Using the singular value decomposition, [25], we write G as a triple product

$$ G = U \Lambda V^\dagger. \tag{4.15} $$

Letting MIN = Min{M,N}, $\Lambda$ is an N by M matrix that is all zeroes except
that its NW corner is occupied by a MIN by MIN diagonal
matrix with jj entry $\lambda_j$. V and U are unitary matrices sized M by M and N
by N respectively. So we have
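$$ C = \frac{P}{\sigma^2 M \ln 2}\, \mathrm{tr} \int_0^1 (V \Lambda^\dagger U^\dagger) \left( U U^\dagger + \frac{P}{\sigma^2 M}\, U \Lambda \Lambda^\dagger U^\dagger \zeta \right)^{-1} U \Lambda V^\dagger \, d\zeta $$

$$ = \frac{P}{\sigma^2 M \ln 2}\, \mathrm{tr} \int_0^1 V \Lambda^\dagger \left( I_N + \frac{P}{\sigma^2 M}\, \Lambda \Lambda^\dagger \zeta \right)^{-1} \Lambda V^\dagger \, d\zeta \tag{4.16} $$
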

where the matrix with inverted diagonal entries that is depicted above is a
diagonal matrix where for j > MIN the $\lambda_j$ vanish. The trace of a square
matrix is unitarily invariant, so upon carrying out the simple integration we
get

$$ C = \sum_{j=1}^{\mathrm{MIN}} \log_2\left( 1 + \frac{P |\lambda_j|^2}{\sigma^2 M} \right). \tag{4.17} $$



Equation (4.17) can be expressed in the more elaborate form

$$ C = \log_2 \det\left[ I_N + \frac{P}{\sigma^2 M}\, \Lambda \Lambda^\dagger \right] = \log_2 \det\left[ I_N + \frac{P}{\sigma^2 M}\, G G^\dagger \right]. \tag{4.18} $$



This is the same as the right hand side of equation (1.2). We have proved, 
that by knowing the strata rates, and using the communication means that we 
have described, in the limit of large n, LogDet capacity is "attained". 
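
The agreement between the sum of eigenrates (4.17) and the LogDet form (4.18) can be checked numerically. The sketch below is illustrative only: it uses NumPy's singular value routine, the function name `eigenrates` is ours, and `snr_per_antenna` again stands for P/(sigma^2 M).

```python
import numpy as np

def eigenrates(G, snr_per_antenna):
    """Summands of (4.17): log2(1 + (P/(sigma^2 M)) |lambda_j|^2), j = 1..MIN."""
    lam = np.linalg.svd(np.asarray(G), compute_uv=False)   # singular values of G
    return np.log2(1.0 + snr_per_antenna * lam ** 2)
```

Summing the returned array should reproduce log2 det[I_N + (P/(sigma^2 M)) G G^H] up to rounding error for any N by M matrix G.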

While our interest is when the strata rates are not told to the 
transmitter, we continue for now with the genie still informing the 
transmitter of the ply rates. It will be useful to express the accumulation of
capacity with the progressive peeling away of plies. We particularly want to
do this for the asymptote of a large number of plies. We take i to be a fixed
fraction, z, of n as $n \to \infty$. From equation (4.12) we see that the differential
capacity added by the i-th ply is

$$ \frac{1}{\ln 2} \sum_{j=1}^{\mathrm{MIN}} \frac{P |\lambda_j|^2 \, dz / (\sigma^2 M)}{1 + (P |\lambda_j|^2 / (\sigma^2 M))(1 - z)}. \tag{4.19} $$



Asymptotically for large n, the capacity accumulated through the normalized
Z = (i/n)-th ply is obtained by integrating (4.19) from 0 to Z to give

$$ \sum_{j=1}^{\mathrm{MIN}} \log_2\left[ 1 + \frac{P |\lambda_j|^2 Z / (\sigma^2 M)}{1 + (P |\lambda_j|^2 / (\sigma^2 M))(1 - Z)} \right]. \tag{4.20} $$

As expected, setting Z = 1 gives equation (4.17).
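
The accumulated capacity (4.20) and the density in (4.19) are simple enough to tabulate directly. The following Python sketch is a hedged illustration: the function names are ours, `lam_sq` holds the values |lambda_j|^2, and `snr_per_antenna` stands for P/(sigma^2 M). The last helper, which differences (4.20) at Z = i/n to obtain large-n ply rates, is our own convenience and is not taken from the text.

```python
import numpy as np

def capacity_density(lam_sq, snr_per_antenna, z):
    """Integrand of (4.19) per unit z: (1/ln 2) * sum_j a_j / (1 + a_j (1 - z))."""
    a = snr_per_antenna * np.asarray(lam_sq, dtype=float)
    return np.sum(a / (1.0 + a * (1.0 - z))) / np.log(2.0)

def accumulated_capacity(lam_sq, snr_per_antenna, Z):
    """Capacity (4.20) accumulated through the normalized ply index Z in [0, 1]."""
    a = snr_per_antenna * np.asarray(lam_sq, dtype=float)
    return np.sum(np.log2(1.0 + a * Z / (1.0 + a * (1.0 - Z))))

def asymptotic_ply_rates(lam_sq, snr_per_antenna, n):
    """Increments of (4.20) between Z = (i-1)/n and Z = i/n, for i = 1..n."""
    Z = np.arange(n + 1) / n
    acc = np.array([accumulated_capacity(lam_sq, snr_per_antenna, z) for z in Z])
    return np.diff(acc)
```

Because the first plies to be detected see the most interference, the increments grow with i; their sum equals the value of (4.17).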



4.3 PUTTING THE GENIE BACK IN THE BOTTLE WHERE IT BELONGS 
As is well known, [16,26,27], a matrix channel possesses generalized 
eigenmodes. To access these noninterfering modes the transmitter needs to 
know the channel matrix. Armed with this knowledge, the transmitter can 
spatially water pour its available power over the modes to obtain a superior 
capacity than if it transmits equal power from each transmitter. Each of the 
summands in equation (4.17) is referred to as an eigenrate. We are
interested in the eigenrates, the rates that the (generalized) eigenmodes can 
support when transmitting with power P/M on each of the transmit antennas. 
When M > N the channel has N eigenmodes; the interpretation of (4.17) is
that they are being blindly accessed by the transmitter, each driven with
power P/M. The wasted power, (M - N)P/M, as well as the inability to water
pour, is the price of channel blind operation.

We see that if all the genie does is inform the transmitter of the 
eigenrates then LogDet capacity can be approached in the limit of an infinite 
number of strata per diagonal layer. The transmitter need not know how to 
access the eigenmodes. When there is only one eigenmode, for example 
(M,l), then knowledge of C is equivalent to knowledge of the eigenrates and 
the genie is not necessary. (In (1,N), the single eigenmode is also known 
without a genie, but we are focusing on M > N.) 

We will now give a general procedure, that can be carried out with a 
Monte Carlo method, for computing a lower bound on what this stratified 
space-time architecture can achieve when the genie is back in the bottle. 
The method is based on the statistical characterization of the channel. We 
stress that the transmitter is assumed privy to that and nothing more. In the 
next section we apply the procedure to an ensemble of random matrix 
Rayleigh channels and establish its effectiveness in some important cases. 

We discuss how to operate at a typically small, say X%, capacity 
outage for an arbitrary random channel ensemble. Since the transmitter does 
not know the set of n bit rates per ply, it needs to make an educated guess to 
achieve a high value of the X%-tile outage capacity. The guessed ply rates 
must be low enough so that all n guesses are successful for at least (100- 
X)% of the channels. 

Choose the guessed rates as those associated with any channel at an 
outage percentile Y% < X% of plies of a hypothetical channel population in 
which the ply rates are known. Enforce these somewhat less ambitious rates 
for each channel in a population where the rates are unknown and look at the
percentage of the channels that have at least one of its n plies in violation of 
the demanded rates. Iterate this calculation by perturbing the starting outage 
percentile (all lower than X%) computed from the bit rates known case until 
an acceptably slightly less than X% violation free count occurs with rates 
unknown. This assures the desired X%-tile outage capacity is met for this 
guessing procedure. While the outage capacity is reduced compared to the 
genie assisted case, in the next section we will see examples where, it is fair
to say, it can come close.
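
One possible reading of the guessing procedure is sketched below in Python. It is an illustration under several assumptions that are ours and not confirmed by the text: channel entries are i.i.d. unit-variance complex Gaussian, the same Monte Carlo sample serves as both the rates-known and the rates-unknown population, channels are ranked by their total genie-informed rate, the ply rate is taken as M times the stratum capacity of (4.10), and the percentile Y is lowered in fixed steps. All function and parameter names are hypothetical.

```python
import numpy as np

def ply_rates(G, n, snr_per_antenna):
    """Genie-informed rates of plies i = 1..n, from (4.6) and (4.10) summed over antennas."""
    N, M = G.shape
    GGh = G @ G.conj().T
    p = snr_per_antenna / n
    rates = np.zeros(n)
    for i in range(1, n + 1):
        for m in range(M):
            g = G[:, m]
            cov = np.eye(N) + p * ((n - i + 1) * GGh - np.outer(g, g.conj()))
            q = np.real(np.vdot(g, np.linalg.solve(cov, g)))
            rates[i - 1] += np.log2(1.0 + p * q)
    return rates

def guess_ply_rates(M, N, n, rho, X, trials=1000, step=0.5, seed=0):
    """Return (guessed ply rates, percentile Y used, violation percentage)."""
    rng = np.random.default_rng(seed)
    shape = (trials, N, M)
    chans = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)
    R = np.array([ply_rates(G, n, rho / M) for G in chans])   # trials x n
    order = np.argsort(R.sum(axis=1))                         # rank by total rate
    for Y in np.arange(X, 0.0, -step):
        guess = R[order[int(trials * Y / 100.0)]]             # rates of channel at Y%
        # A channel is in violation if any one of its plies cannot support the guess.
        violation_pct = 100.0 * np.mean(np.any(R < guess - 1e-12, axis=1))
        if violation_pct <= X:
            return guess, float(Y), violation_pct
    return np.zeros(n), 0.0, 0.0                              # fallback: no rate guessed
```

The sum of the returned ply rates is then a lower bound estimate, for this sample, of the X% outage capacity available without the genie.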

5 NUMERICAL EXAMPLES FOR MATRIX RAYLEIGH CHANNEL 

Next we report some examples for the ideal matrix Rayleigh channel. 
We will see cases where the transmitter is not all that much in the dark as to 
what bit rate to use for each stratum, or equivalently, for each ply. If M ≫ N
the M eigenrates substantially become hardened, [1,28-31], as they do when
both M and N get large and M is a fixed fraction of N. In all cases, whether
there is hardening or not, the iterative method of the last section is 
applicable. 

We include results for a sampling of (M,N) systems using 10% 
outage, leaving comprehensive numerical studies for the future. The first 
two examples, (8,1) and (8,3) pertain to the important downlink case where 
the end user has few antennas compared to the base(s). A (4,2) and a (2,2) 
case are also reported. Heavy use will be made of the formulas derived in 
the last section. Since in all four examples M > N, we have that MIN = N. 

The graphs that we present display 10% outage channel capacity 
versus average received signal to noise ratio (SNR). The average received 
SNR, $\rho$ ($\rho_{dB} = 10 \log_{10} \rho$), is calibrated as follows. If one 
makes a test measurement transmitting all the available power, P, out of any 
of the M transmit antennas, say the m-th, and makes repeated statistically 
independent channel measurements at any receive antenna, say the j-th, then 
$\rho$ is given by $\rho = (P/\sigma^2) \cdot E|G_{jm}|^2$. The expectation 
averages over all the statistically independent instantiations of the entry 
$G_{jm}$. Note, $\rho$ is independent of the index pair jm. In the SD-BLAST 
examples power P/M is transmitted from each of M antennas in the transmit 
array. All of the examples have SNRs in the range of $\rho_{dB}$ = -6 dB to 
+24 dB with 3 dB steps. 
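
For orientation, the LogDet reference curves on this SNR grid can be estimated by Monte Carlo. The sketch below assumes the standard equal-power expression log2 det(I_N + (ρ/M) G G†) for LogDet and unit-variance complex Gaussian entries, so that ρ equals P divided by the noise power; the function names and the trial count are illustrative choices, not taken from this report.

```python
import numpy as np

def logdet_capacity(G, rho):
    """log2 det(I_N + (rho/M) G G^dagger) in bps/Hz for a stack of N x M channels."""
    N, M = G.shape[-2:]
    gram = G @ np.conj(np.swapaxes(G, -1, -2))
    _, logabsdet = np.linalg.slogdet(np.eye(N) + (rho / M) * gram)
    return logabsdet / np.log(2.0)

def outage_capacity(M, N, rho_db, outage_pct=10.0, trials=20000, seed=0):
    """Capacity exceeded by all but outage_pct percent of matrix Rayleigh channels."""
    rng = np.random.default_rng(seed)
    # Unit-variance entries, so E|G_jm|^2 = 1 in the calibration of rho.
    G = (rng.standard_normal((trials, N, M))
         + 1j * rng.standard_normal((trials, N, M))) / np.sqrt(2.0)
    rho = 10.0 ** (rho_db / 10.0)
    return float(np.percentile(logdet_capacity(G, rho), outage_pct))

snr_grid_db = np.arange(-6, 25, 3)   # -6 dB to +24 dB in 3 dB steps
for M, N in ((8, 1), (8, 3), (4, 2), (2, 2)):
    curve = [outage_capacity(M, N, s) for s in snr_grid_db]
    print((M, N), np.round(curve, 2))
```

These are upper-envelope values only; the SD-BLAST curves for a finite number of strata require the ply formulas of the last section.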

Because of interest in V-BLAST, [32], which works with codes 
designed for a standard AWGN environment, we also include in our 
examples some comparative ultimately encoded V-BLAST curves. In the 
examples the V-BLAST system is always based on maximum SINR instead 
of zero forcing. It must be stressed that V-BLAST was designed for 
situations where N ≥ M, so it cannot be expected to exhibit strong 
performance in our examples. With V-BLAST, using fewer than M transmit 
antennas can give better performance than using all M 
available. Therefore, in all examples we optimize the number used. 
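
One plausible way to compute such a comparison curve is sketched below. It is a reading of the description above, not the authors' code, and it makes several assumptions: maximum SINR is implemented as MMSE nulling with successive cancellation, the strongest remaining stream is detected first, all substreams carry the same rate (so the weakest sets it), the total power is split equally over the m antennas used, and the first m antennas are the ones used. The function names are hypothetical.

```python
import numpy as np

def vblast_rate(H, rho):
    """Equal-rate V-BLAST rate (bps/Hz) on one N x m channel, with max-SINR
    (MMSE) nulling and successive cancellation, strongest stream first."""
    N, m = H.shape
    remaining = list(range(m))
    sinrs = []
    while remaining:
        best_k, best_sinr = None, -1.0
        for k in remaining:
            others = [j for j in remaining if j != k]
            Ho = H[:, others]
            cov = np.eye(N) + (rho / m) * Ho @ Ho.conj().T   # noise plus interference
            h = H[:, k]
            sinr = (rho / m) * np.real(h.conj() @ np.linalg.solve(cov, h))
            if sinr > best_sinr:
                best_k, best_sinr = k, sinr
        sinrs.append(best_sinr)
        remaining.remove(best_k)   # detected stream is cancelled
    # Assumption: every substream carries the same rate, set by the weakest.
    return m * np.log2(1.0 + min(sinrs)), sinrs

def vblast_outage_capacity(M, N, rho_db, outage_pct=10.0, trials=500, seed=0):
    """Best outage capacity over the number m <= M of transmit antennas used."""
    rng = np.random.default_rng(seed)
    rho = 10.0 ** (rho_db / 10.0)
    G = (rng.standard_normal((trials, N, M))
         + 1j * rng.standard_normal((trials, N, M))) / np.sqrt(2.0)
    best = (0.0, 0)
    for m in range(1, M + 1):
        rates = [vblast_rate(Gi[:, :m], rho)[0] for Gi in G]
        best = max(best, (float(np.percentile(rates, outage_pct)), m))
    return best   # (capacity in bps/Hz, antennas used)

cap, m_used = vblast_outage_capacity(4, 2, rho_db=10.0)
print(f"(4,2) at 10 dB: {cap:.2f} bps/Hz at 10% outage using {m_used} transmit antennas")
```

Repeating the last call over the SNR grid shows how the optimizing m can change from one abscissa value to the next.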

SD-BLAST EXAMPLE: (8,1) 

The SD-BLAST graphs for 1, 2, 4, 8, 16, 32 and 64 strata per layer are 
shown in Figure 6A for the (8,1) example (as they are for the next two 
examples as well). For an (8,1) system the SD-BLAST graphs evidence D- 
BLAST, or what is the same, LogDet, performance as an upper envelope as 
predicted by the theory. Since the transmitter knows the channel statistics it 
knows the lower 10%-tile of $|\lambda_1|^2$. So, in this case, the single eigenrate at the 
10% outage level is known to the transmitter. The bit rates per ply in the 
limit of large n are thereby known. Moreover, the guessing procedure was so 
effective here that the violation counts were negligible for all the n shown. 
So in this case the ply bit rates were essentially known. (This will not be the 
case in the subsequent examples.) 
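
To make concrete how the statistics alone fix the rate in an (M,1) system, the sketch below computes the 10%-tile exactly. It assumes unit-variance complex Gaussian entries, so that the single channel gain, the sum over m of |G_1m|^2, is a sum of M unit-mean exponentials; it also assumes the equal-power LogDet expression log2(1 + (ρ/M) · gain). Identifying this gain with the eigenvalue quantity named above is an assumption about notation, and the helper names are hypothetical.

```python
import math

def erlang_cdf(x, M):
    """P(sum of M unit-mean exponential variables <= x)."""
    return 1.0 - math.exp(-x) * sum(x**k / math.factorial(k) for k in range(M))

def erlang_quantile(p, M, tol=1e-12):
    """Solve erlang_cdf(x, M) = p for x by bisection."""
    lo, hi = 0.0, 1.0
    while erlang_cdf(hi, M) < p:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, M) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

M, rho_db, outage = 8, 10.0, 0.10
rho = 10.0 ** (rho_db / 10.0)
gain_10pct = erlang_quantile(outage, M)               # lower 10%-tile of the channel gain
rate_10pct = math.log2(1.0 + (rho / M) * gain_10pct)  # LogDet rate at that gain
print(f"10%-tile gain = {gain_10pct:.3f}, 10% outage LogDet rate = {rate_10pct:.3f} bps/Hz")
```

No channel realization enters the calculation, which is the sense in which the transmitter knows the rate in this case.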

The lower double dash curve that appears in each figure expresses the 
performance of the upper limit of encoded V-BLAST systems. As expected, 
V-BLAST exhibits inferior performance. 



Figure 6A: (8,1) SD-BLAST, CAPACITY vs AVERAGE SNR 

In this example, as well as in the three that follow, the n signal plies 
were equally energized. We mention in passing that one can get improved 
capacity at any given outage %-tile by optimizing the energy distribution 
over the n plies. To illustrate this, in this case, for n = 2 we optimized the 
energy distributed over the two plies and found considerable capacity 
improvement, as the bold curve in Figure 6A shows. Of course, one 
cannot improve over the upper envelope of LogDet performance; however, with 
(M,1) systems one can look to improve convergence to LogDet with 
increasing n by optimizing the distribution of the n signal energies. 

SD-BLAST EXAMPLE: (8,3) 

In this (8,3) case with M > N > 1, the problem is more difficult than 
the previous one since the transmitter does not know the set of n bit rates per 
stratum to use in the limit of large n. The transmitter's guessing of the ply 
rates is now crucial. We cannot expect to get the same 10%-tiles as if the ply 
rates were known. The aforementioned guessing technique was used to 
produce the 10%-tile capacities. Figure 6B shows the degradation of these 
10%-tiles relative to the 10%-tiles obtained when the n ply rates are known; 
the degradations are barely noticeable. 

The V-BLAST curves exhibit a peaking of the derivative. This is 
because, for each SNR reported, (M,N) V-BLAST uses the 
optimum number of transmitters less than or equal to M, and this 
optimum can change with the abscissa value. Such changes in the number of 
transmitters also occur in the other three examples, but in Figure 6A there 
was no noticeable peaking of the slope. 

Figure 6B: (8,3) SD-BLAST, CAPACITY vs AVERAGE SNR 

SD-BLAST EXAMPLE: (4,2) 

In the third example, a (4,2) case depicted in Figure 6C, the excess of 
M over N is not as great as in the previous two examples. Therefore, as 
expected, the relative performance is not quite as good as in the previous two 
cases. However, because of substantial eigenrate hardening, the deficit from 
known bit rates is still significantly less than one bps/Hz. 

Figure 6C: (4,2) SD-BLAST, CAPACITY vs AVERAGE SNR 

SD-BLAST EXAMPLE: (2,2) 

The fourth example, a (2,2) case, is shown in Figure 6D. It is the most 
challenging of the four cases. To avoid overcrowding, only the eight and 
sixty-four strata SD-BLAST curves are shown. There is a significant 
gap between what is possible with the ply bit rates known versus 
unknown. A transmit diversity scheme from reference [15], where the 
architectural superstructure requires two instances of 1-D encoding each 
with 2-D decoding, is also shown. Even with eight strata, SD-BLAST is seen 
to be competitive. 

Figure 6D1: (2,2) SD-BLAST, CAPACITY vs AVERAGE SNR 

The curves we have shown may make one wonder if the upper 
envelope of our genie unassisted examples is LogDet. The theory tells us 
that this cannot be so for the (8,3), (4,2) and (2,2) examples. We looked for 
numerical confirmation of a deficit for this (2,2) example, at 24dB. We 
obtained results for a very large number of strata and indeed we noticed 
saturation below LogDet, as the last figure, Figure 6D2, shows. 

Figure 6D2: (2,2) SD-BLAST, CAPACITY AT 24dB vs No. OF STRATA. 
EXPECTED SATURATION BELOW D-BLAST ASYMPTOTE 

[Plot: M=2, N=2, $\rho$ = 24 dB. Horizontal axis: number of strata (n), 200 to 
1200. Curves: D-BLAST, SD-BLAST with known rates, SD-BLAST with unknown rates.] 

6 BRIEF COMMENTS ON VARIATIONS INCLUDING CDMA 

There are straightforward variations on the covering of plies with 
helices. E.g., by perturbing the long message length, T, slightly so that T and 
M are relatively prime, one can arrange to uniformly wrap a ply with just 
one (approximately M times longer) helix rather than M parallel helices. 
This is meaningful when the code used has a time sense. In the course of 
detection processing along the single longer helix, when repeating the time 
of symbol receipt coordinate, merges must have been forced on the 
simultaneously received, but previously processed helical section(s). In fact, 
the entire message can be just one long helix by dropping down to the next 
inner ply to continue the ply wrapping after an outer ply is completely 
wrapped. In this case the received signal manifold can be taken to be like the 
snake who swallows his own tail along the lines depicted in [33]. 
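
The role of the relatively prime condition can be checked with a toy model. The sketch below is an idealization that is not taken from this report: a ply is modeled as a T by M grid of (time slot, antenna) cells, and a helix as a walk that advances one slot and one antenna per step, wrapping modulo T and modulo M. The function names are hypothetical.

```python
from math import gcd

def helix_cells(T, M):
    """Cells (time slot, antenna) visited by one diagonal walk of T*M steps
    that advances one slot and one antenna per step, with wraparound."""
    return [(k % T, k % M) for k in range(T * M)]

def single_helix_covers(T, M):
    """True if one helix visits every cell of the T x M ply exactly once."""
    return len(set(helix_cells(T, M))) == T * M

for T in (12, 13):
    print(f"T={T}, M=4, gcd={gcd(T, 4)}: single helix covers the ply? "
          f"{single_helix_covers(T, 4)}")
```

In this model, T = 12 with M = 4 closes the walk after 12 steps, so four parallel helices are needed, whereas perturbing T to 13 lets one helix of length 52 cover the whole ply.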

As the number of plies increases, the variations of the previous two 
paragraphs are easily shown to be of secondary importance in terms of the 
capacity improvement afforded relative to the M-fold simultaneous detection 
featured for simplicity in Section 4. 

One can replace the spatially 1-D codec constraint with, say, a 
spatially k-D codec constraint (k < M and k dividing M). The 1-D choice in 
this report was the extreme low dimensional choice used for illustrative 
purposes. 

The SD-BLAST approach also applies to frequency selective systems. 
For example, one can look to use a space-time message superstructure in 
accordance with OFDM features when communicating over a broadband 
frequency selective channel. A possible architecture is a closed loop chain 
of plied tori, one plied torus per subband and one or more per decorrelation 
band. Coding is over subbands. The chain link order would express the order 
in which the receiver's processing clock is associated with passage from one 
subband to the next. The standard frequency axis ordering is not to be 
respected since it is essential to pass very quickly through all decorrelated 
subbands and all antennas so that the stronger signal subbands can 
compensate weaker signal subbands at maximum rate. Ideally, the dwell on 
a single stratum would be only one symbol duration, then on to the 
corresponding stratum in the next torus in the chain. Thereby, bits 
experience maximum space/frequency diversity and bit decisions are not 
held unnecessarily in abeyance. 

With the exception of the last paragraph, the focus of this paper has 
been on any single one of many spectrally disjoint narrowband users. The 
numerical results presented in the last section go over immediately for any 
one of the orthogonal users in an idealized CDMA system based on 
orthogonal codes and operating over a flat channel. It is also possible to 
numerically assess performance of SD-BLAST CDMA systems using PN 
codes. Moreover, the SD-BLAST approach also applies to frequency 
selective CDMA systems. 

7 SD-BLAST: SUMMARY AND FURTHER WORK 

SD-BLAST is an architectural superstructure for an (M,N) 
communication system. Signal constituents are received symmetrically 
arranged in space-time in nested toroidal plies that are helically wound. 
They are so arranged to enable the receiver to substantially mute inevitable 
mutual interference of the M simultaneous transmissions in a multipath 
environment. At the same time SD-BLAST serves to enable implementation 
with 1-D codecs. It helps avoid practical problems associated with the 
earlier D-BLAST. Namely, it helps to avoid using long diagonals which 
force waste of the space-time resource and make it problematic to pack 
enough coded blocks into a message of limited duration. 

The scope of this report does not include the lower level issue of 1-D 
codec design for the coded block occupying a stratum. With SD-BLAST, 
within a stratum, an SINR of period M is experienced. References [34,35] 
discuss coding for systems with periodically varying SNRs as arise in 
ADSL systems. The authors stress that the signalling alphabet must be 
sufficiently rich to take advantage of the better SNRs. We note that, with 
SD-BLAST, with more and more plies, even a code using a binary 
constellation meets this important requirement because of the way that the 
bps/ply decreases with increasing n. Furthermore, the period M cycling at the 
transmitter has stronger transmissions compensating weaker ones at a 
maximum pace. This concentration of space diversity with the passage of 
processing time promotes quick delivery of bit decisions. 

An information theory analysis showed that the SD-BLAST 
communication architecture can often get close to LogDet performance. 
In particular, we highlighted important downlink cases, mostly with M 
significantly greater than N, i.e., where the receiver has few antennas 
compared to the base(s). In illustrative (8,1), (8,3), (4,2) and (2,2) examples 
involving matrix Rayleigh channels we quantified the extent to which one 
can expect to get close to LogDet performance. Under the assumption of 
equal energy per stratum, Figures 6A-D quantified how much stratification 
was needed to support a particular capacity level at 10% outage for SNRs 
spanning -6dB to +24dB. 

The simulation results reported in the literature, [36-39] (TURBO- 
BLAST architecture), [22] (Turbo Space-Time Architecture) and [40], 
(Threaded Space-Time Architecture) involving BLAST in conjunction with 
iterative receivers helped provide motivation for the work reported in our 
paper. These architectures do not include the element of stratification that 
we have explored that enables us to show that in some cases we can "attain" 
or come close to "attaining" LogDet performance in the limit of infinitely 
many strata. 

Moreover, the SD-BLAST approach, theoretically analyzed here in 
the limit of zero error rate, was not iterative while the approaches in 
references of the previous paragraph were iterative. Of course, in practice, 
the SD-BLAST structure would operate under a specific bit or block error 
rate requirement. Using simulation, it would be interesting to quantify any 
advantages to including some iterative aspects of interstrata processing in 
SD-BLAST. Thus, one could see if it is worthwhile not to limit the 
processing to one pass through the various strata. Specific codes, like LDPC 
[41] and TURBO codes, can be tested in such SD-BLAST simulations. 
Ultimately, the most promising codes could be tested over the air using the 
Crawford Hill Wireless Research Department prototype. 

It is worthwhile mentioning that we can conclude from [22-24] that 
LogDet can also be "attained" with separately coded 1-D signals radiating 
from each transmit antenna using codes that are designed for a standard 
AWGN environment. Each transmit antenna may, depending on the channel, 
transmit at a different rate. However, this requires that the M transmit rates 
be fed back to the transmitter. 
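
One standard way to see why M separately coded rates can sum to LogDet is the chain rule for MMSE detection with successive cancellation. It is given here as a textbook identity under the equal-power model, not necessarily as the construction used in [22-24]. The symbols $g_k$ (the k-th column of the channel matrix G) and $\mathrm{SINR}_k$ (the SINR of stream k when streams 1 to k-1 have already been cancelled) are notation introduced for this illustration.

```latex
\log_2\det\!\Big(I_N + \tfrac{\rho}{M}\,G G^{\dagger}\Big)
  = \sum_{k=1}^{M} \log_2\big(1 + \mathrm{SINR}_k\big),
\qquad
\mathrm{SINR}_k = \tfrac{\rho}{M}\; g_k^{\dagger}
  \Big(I_N + \tfrac{\rho}{M}\sum_{j>k} g_j g_j^{\dagger}\Big)^{-1} g_k .
```

Each term follows from the matrix determinant lemma, $\det(A + u u^{\dagger}) = \det(A)\,(1 + u^{\dagger} A^{-1} u)$, applied one column at a time. Since every $\mathrm{SINR}_k$ depends on the channel realization, so do the M rates, which is why they must be fed back.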

We stressed the random channel case. Suppose instead the channel is 
not random, just that it is unknown to the transmitter. In this case the 
transmitter, besides N and of course M, is given only the capacity of the 
fixed but otherwise completely unknown channel it is to communicate over. 
The question is: What capacity can be achieved with SD-BLAST when the 
transmitter is so blind? For fixed energy per ply and a large number of plies 
this is a very tractable optimization problem. The associated issue of 
optimizing the energy levels in a given number of plies is an interesting one 
to pursue for both the random and nonrandom channel cases. 